Despite significant progress in the identification of genetic loci for age-related macular degeneration (AMD), 
not all of the heritability has been explained. To identify variants which contribute to the remaining genetic 
susceptibility, we performed the largest meta-analysis of genome-wide association studies to date for 
advanced AMD. We imputed 6 036 699 single-nucleotide polymorphisms with the 1000 Genomes Project 
reference genotypes on 2594 cases and 4134 controls with follow-up replication of top signals in 5640 
cases and 52 174 controls. We identified two new common susceptibility alleles, rs1999930 on 6q21-q22.3 
near FRK/COL10A1 [odds ratio (OR) 0.87; P = 1.1 × 10^-8] and rs4711751 on 6p12 near VEGFA (OR 1.15; 
P = 8.7 × 10^-9). In addition to the two novel loci, 10 previously reported loci in ARMS2/HTRA1 
(rs10490924), CFH (rs1061170 and rs1410996), CFB (rs641153), C3 (rs2230199), C2 (rs9332739), CFI 
(rs10033900), LIPC (rs10468017), TIMP3 (rs9621532) and CETP (rs3764261) were confirmed with genome-
wide significant signals in this large study. Loci in the recently reported genes ABCA1 and COL8A1 were 
also detected with suggestive evidence of association with advanced AMD. The novel variants identified in 
this study suggest that angiogenesis (VEGFA) and extracellular collagen matrix (FRK/COL10A1) pathways 
contribute to the development of advanced AMD. 



INTRODUCTION 

Advanced age-related macular degeneration (AMD) (MIM 
603075) is a leading cause of visual impairment and blindness 
in people older than 60 years. AMD is a common, late-onset 
disease that is modified by covariates including smoking and 
body mass index and has recurrence ratios for siblings of a 
case that are 3-6-fold higher than in the general population 
(1). The burden of this disease is increasing among the 
growing elderly population. Among individuals aged 75 or 
older, approximately one in four have some sign of this 
disease and about one in 15 have the advanced form with 
visual loss (2). There are two main forms of advanced 
AMD. The neovascular (NV), or 'wet', form is characterized 
by in-growth of choroidal vessels under the retina. Geographic 
atrophy (GA), the advanced 'dry' form of the disease, occurs 
when there is full thickness loss of the outer retinal layers, 
retinal pigment epithelium (RPE) and choriocapillaris in the 
central macula. Although anti-vascular endothelial growth 
factor (VEGF) therapy has significantly improved the func- 
tional and morphological outcomes for patients with NV 
disease (3), there are currently no effective therapies or pre- 
ventive strategies for GA. 

Several genetic loci have been associated with advanced 
AMD, including complement pathway genes CFH (4-9), C2 
(8,10), CFB (8,10), C3 (11), CFI (12) and the ARMS2/HTRA1 
(13,14) region. Recent genome-wide studies in large cohorts 
have also identified the association between advanced AMD 
and variants in LIPC (15), a gene in the high-density lipoprotein 
(HDL) pathway, and TIMP3 (16), and suggested association 
with other loci in the HDL pathway. The discovery of the multiple 
associations with complement-related genes revealed an 
unanticipated central role for this pathway in disease pathogenesis. 
This has led directly to the initiation of multiple clinical 
trials of drugs that alter the complement pathway in AMD 
patients (17). A combined risk score including these multiple 
genetic loci along with demographic, environmental and 
macular characteristics which modify risk is highly predictive 
of progression from the early and intermediate stages of AMD 
to the advanced stages which cause visual loss (18,19). 

The genetic variants known to date are estimated to account 
for <50% of the heritability of the disease (8,20). To identify 
additional loci that contribute to the genetic risk of advanced 
AMD and to illuminate new candidate physiological processes 
that might be involved, we performed a meta-analysis of 
genome-wide association studies (GWAS) for advanced AMD 
that comprised cases and controls from the Tufts/Massachusetts 
General Hospital (MGH) GWAS Cohort Study (15) and the 
Michigan, Mayo, Age-Related Eye Disease Study (AREDS), 
Pennsylvania (MMAP) Cohort Study (16), as well as controls 
from the Myocardial Infarction Genetics Consortium 
(MIGen) (21) and the Genetic Association Information 
Network (GAIN) Schizophrenia Study (22). We imputed a 
large number of single-nucleotide polymorphisms (SNPs) 
using the 1000 Genomes Project reference data to search 
deeply throughout the genome in this large merged data set 
of Tufts/MMAP/MIGen/GAIN (TMMG). We then sought 
direct replication of the top representative SNPs of each 
clumped region in 10 independent cohorts from Johns 
Hopkins University (JHU), Columbia University (COL), Gen- 
entech, deCODE (Iceland), Washington University (Wash-U), 
Centre for Eye Research Australia (AUS), the Rotterdam 
Study (RS), an independent replication sample from Tufts/ 
MGH, Hôpital Intercommunal de Créteil (FR-CRET) and 
The Queen's University of Belfast (Irish). We also conducted 
a combined analysis for the results of top SNPs in all 
participating cohorts using a fixed effects model. 



RESULTS 

After the quality control analyses (see Materials and Methods; 
Supplementary Material, Table S1), the TMMG data set consisted 
of genotype data for 2594 individuals with advanced 
AMD and 4134 controls, all of European ancestry. A set of 
6 036 699 high-quality SNPs from imputation using the 1000 
Genomes Project data was tested for association with 
advanced AMD. We plotted the P-values from our GWAS 
meta-analysis in quantile-quantile plots. The strong associations 
of previously reported SNPs distorted the P-value distribution 
toward the top end of the plot (Supplementary Material, 
Fig. S1A). After removing these well-validated associated 
loci, we observed little statistical inflation in the remaining 
distribution of association statistics (inflation factor λ_GC = 
1.047; Supplementary Material, Fig. S1B). Since the inflation 
factor scales with sample size, we estimated the value that 
would be expected in a study of 1000 cases and 1000 controls 
(λ_1000 = 1.015). Again, there was little evidence of any 
general inflation of the test statistics. As expected, we 
observed highly statistically significant association signals at 
SNPs in six previously published loci, including ARMS2/
HTRA1 (rs10490924, P = 1.2 × 10^-[?]), CFH (rs1061170, 
P = 5.6 × 10^-[?], and rs1410996, P = 2.1 × 10^-[?]), CFB 
(rs641153, P = 2.9 × 10^-[?]), C3 (rs2230199, P = 1.4 × 
10^-[?]), C2 (rs9332739, P = 4.3 × 10^-[?]), CFI (rs10033900, 
P = 2.4 × 10^-[?]) and LIPC (rs1532085, P = 1.0 × 10^-[?]) 
(Fig. 1). 
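
The rescaling of the inflation factor to a study of 1000 cases and 1000 controls can be illustrated with a short sketch. The text does not state which formula was used; the version below assumes the commonly used rescaling in which the excess inflation (λ - 1) is scaled by the ratio of (1/cases + 1/controls) in the actual study to that in the reference study. The function name `lambda_1000` is illustrative only.

```python
def lambda_1000(lambda_gc, n_cases, n_controls, ref_cases=1000, ref_controls=1000):
    """Rescale a genomic inflation factor to a reference study size.

    Assumed formula (not stated in the text):
    lambda_ref = 1 + (lambda - 1) * (1/n_cases + 1/n_controls)
                                  / (1/ref_cases + 1/ref_controls)
    """
    study = 1.0 / n_cases + 1.0 / n_controls
    reference = 1.0 / ref_cases + 1.0 / ref_controls
    return 1.0 + (lambda_gc - 1.0) * study / reference


# Sample sizes and lambda_GC as reported for the TMMG data set.
value = lambda_1000(1.047, n_cases=2594, n_controls=4134)
print(round(value, 3))
```

With the reported inputs (λ_GC = 1.047, 2594 cases, 4134 controls), this assumed formula gives a value that rounds to the 1.015 quoted above.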

In addition to the previously identified loci, we detected a 
region at 6q21-q22.3 (Fig. 2A) that contained 30 SNPs in 
tight LD (R^2 > 0.8) which were strongly associated with 
AMD status in the TMMG sample (P < 5 × 10^-[?]). The 
associated region contains the genes COL10A1 (encoding the 
alpha chain of type X collagen) and FRK (encoding the fyn-
related kinase). To confirm the new locus for advanced 
AMD, we selected two SNPs, rs12204816 (P = 1.73 × 10^-[?], 
near COL10A1) and rs1999930 (P = 3.1 × 10^-[?], between 
FRK and COL10A1), from this block for further replication 
study. In addition to the FRK/COL10A1 variants, we also 
sought to replicate 37 other previously unreported candidate 
loci (P < 5 × 10^-[?] in the TMMG meta-analysis), as well as 
previously reported loci. 

In aggregate, the replication data sets consisted of 5640 
cases and 52 174 controls from 10 independent cohorts from 
JHU, COL, Genentech, Iceland, Wash-U, AUS, RS, 
FR-CRET, Irish and an independent replication sample from 
Tufts/MGH (Supplementary Material, Table S2). The effec- 
tive sample sizes of each cohort are noted in Supplementary 
Material, Table S3. Of the two SNPs we selected for replication 
in the FRK/COL10A1 locus, rs12204816 failed the genotyping 
quality criteria in the replication phase, but rs1999930 
was successfully genotyped in all 10 replication cohorts. In 
the TMMG meta-analysis, the minor T allele frequency of 
rs1999930 was 26% in cases and 30% in controls (Table 1), 
with an odds ratio (OR) of 0.81 and a 95% confidence interval 
(CI) of 0.74-0.88 (Fig. 2B; Supplementary Material, 
Table S3). Combining the effect sizes of all independent replication 
cohorts using a fixed effects model confirmed the 
association (OR = 0.90, P = 8.3 × 10^-[?]). In the combined 
analysis of all the samples, the T allele of rs1999930 significantly 
(P = 1.1 × 10^-8) reduced the risk of advanced AMD 
[OR = 0.87 (95% CI: 0.83-0.91)]. There was no significant 
evidence for heterogeneity under Cochran's Q-test (P = 
0.32, I^2 = 15%) across data sets. 

Another previously unreported locus (rs4711751) 
near VEGFA with a suggestive association signal (P = 2.2 
× 10^-[?]) in the TMMG meta-analysis was confirmed in our replication 
study. The T allele of rs4711751, with an allele frequency 
of 0.54 in cases and 0.50 in controls, was associated with 
increased risk of advanced AMD [OR = 1.21 (95% CI: 1.11-
1.32)]. The results were consistent in direct replication genotyping 
in an independent set of 5419 cases and 47 687 controls 
[OR = 1.13 (95% CI: 1.06-1.19), P = 4.3 × 10^-[?]]. This SNP 
reached genome-wide significance [OR = 1.15 (95% CI: 
1.10-1.21), P = 8.7 × 10^-9] in the combined analysis 
(Fig. 2C and D; Supplementary Material, Table S4), including 
all replication cohorts except the Rotterdam Study, in which 
rs4711751 was not genotyped. We found no significant evidence 
for heterogeneity (P = 0.26, I^2 = 24%) for the rs4711751 
association results across the nine cohorts tested. 
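
As a rough illustration of how a fixed effects model combines estimates, the sketch below pools the discovery and replication odds ratios reported above for rs4711751 by inverse-variance weighting on the log-OR scale. It is not the authors' pipeline (the combined analysis used per-cohort estimates), and the standard errors here are back-calculated from the rounded 95% CIs, which is an assumption that introduces small rounding error. The helper names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover log(OR) and its standard error from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z95)
    return beta, se


def fixed_effect(estimates):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return beta, math.sqrt(1.0 / total)


discovery = log_or_and_se(1.21, 1.11, 1.32)    # TMMG meta-analysis
replication = log_or_and_se(1.13, 1.06, 1.19)  # pooled replication cohorts
beta, se = fixed_effect([discovery, replication])
pooled_or = math.exp(beta)
ci = (math.exp(beta - Z95 * se), math.exp(beta + Z95 * se))
print(f"{pooled_or:.2f} ({ci[0]:.2f}-{ci[1]:.2f})")
```

Under these assumptions, the pooled estimate comes out close to the reported combined OR of 1.15 (95% CI: 1.10-1.21).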

Besides the two novel FRK/COL10A1 and VEGFA loci, 
three recently reported loci were also associated with 
advanced AMD (Table 1). The risk variants in TIMP3 
(rs9621532, P = 2.2 × 10^-[?]) and HDL pathway genes 
LIPC (rs10468017, P = 2.7 × 10^-[?]) and CETP (rs3764261, 
P = 6.9 × 10^-[?]) reached genome-wide significance in the 
combined analysis. Two other variants in ABCA1 
(rs1883025, P = 1.2 × 10^-[?]) and COL8A1 (rs13095226, 
P = 9.1 × 10^-[?]), which were reported in our previous 
GWAS (15), are also still noteworthy candidates (Supplementary 
Material, Table S5). Supplementary Material, Table S6, 
shows other published candidate SNPs which were not associated 
with advanced AMD in this GWAS meta-analysis. 

We also investigated the specific association with GA and 
NV subtypes of AMD in our TMMG samples. The minor 
allele (T) of rs1999930 had a similar effect size for GA 
[OR = 0.78 (0.69-0.89), P = 1.0 × 10^-[?]] and NV [OR = 
0.82 (0.75-0.90), P = 4.1 × 10^-[?]]. The risk allele (T) of 
rs4711751 also had a similar magnitude of effect on GA 
[OR = 1.23 (1.08-1.40), P = 2.0 × 10^-[?]] and NV [OR = 
1.20 (1.09-1.32), P = 2.5 × 10^-[?]]. Association signals at 
CFH, C2, CFB, C3, CFI and ARMS2/HTRA1 were also significant 
for both GA and NV compared with controls. ARMS2/
HTRA1 was more strongly related to NV compared with GA, 
as previously reported (23). 

Figure 1. Manhattan plot. -log10(P) values of association results from the cleaned TMMG data set are plotted for SNPs on each chromosome. SNPs with P < 5 × 
10^-[?] are colored in red and the representative genes for each associated region are labeled. 

This study provides an opportunity to establish a prediction 
model for advanced AMD with all the associated genetic risk 
factors combined together. We evaluated a risk score based on 
the sum of the genotype dosage of 14 risk variants (SNPs in 
Table 1 plus rs1883025 in ABCA1 and rs13095226 in 
COL8A1 in Supplementary Material, Table S5) which were 
validated or suggested in this study, each weighted by the 
natural logarithm of the OR estimated by a multivariate logistic 
regression model of these 14 variants in TMMG samples. It is 
estimated that there is a >50-fold difference in advanced 
AMD risk between the high-risk individuals (risk score >2) 
and the low-risk individuals (risk score < -2) (Supplementary 
Material, Fig. S2). 
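
The weighted risk score described above can be sketched as a sum of allele dosages multiplied by ln(OR). The sketch below is illustrative only: it uses two SNPs rather than 14, it substitutes the combined-analysis ORs quoted in this paper for the multivariate regression weights (which are not given in the text), and the dosages and the optional mean-centering step are assumptions.

```python
import math


def genetic_risk_score(dosages, odds_ratios, mean_dosages=None):
    """Sum of allele dosages weighted by ln(OR).

    dosages      -- allele dosage per SNP (0 to 2) for one individual
    odds_ratios  -- per-allele OR for the same allele of each SNP
    mean_dosages -- optional population mean dosage per SNP; if given,
                    the score is centered (an assumption, not from the text)
    """
    score = 0.0
    for snp, dosage in dosages.items():
        center = mean_dosages[snp] if mean_dosages else 0.0
        score += math.log(odds_ratios[snp]) * (dosage - center)
    return score


# Stand-in weights: combined-analysis ORs for the T alleles of the two new loci.
weights = {"rs1999930": 0.87, "rs4711751": 1.15}

# Hypothetical individuals.
higher_risk = genetic_risk_score({"rs1999930": 0, "rs4711751": 2}, weights)
lower_risk = genetic_risk_score({"rs1999930": 2, "rs4711751": 0}, weights)
print(higher_risk, lower_risk)
```

Because the weights are log odds ratios, a difference of d between two scores corresponds to an odds ratio of exp(d) under the model.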



DISCUSSION 

In this study aiming to find new genetic factors for advanced 
AMD, we report a genome-wide significant association near 
FRK/COL10A1 (rs1999930, P = 1.1 × 10^-8), a locus not 
previously implicated in advanced AMD. We also identified 
a novel locus (rs4711751, P = 8.7 × 10^-9) for advanced 
AMD near VEGFA. In addition, we confirmed strong association 
with the previously reported genetic variations at 10 
loci including ARMS2/HTRA1 (rs10490924, P = 3.6 × 
10^-[?]), CFH (rs1061170, P = 1.3 × 10^-[?], and rs1410996, 
P = 7.4 × 10^-[?]), CFB (rs641153, P = 5.5 × 10^-[?]), C3 
(rs2230199, P = 4.6 × 10^-[?]), C2 (rs9332739, P = 2.4 × 
10^-[?]), CFI (rs10033900, P = 4.1 × 10^-[?]), LIPC 
(rs10468017, P = 2.7 × 10^-[?]), TIMP3 (rs9621532, P = 
2.2 × 10^-[?]) and CETP (rs3764261, P = 6.9 × 10^-[?]) in the 
combined analysis. Our analyses also support previously 
identified loci in ABCA1 and COL8A1. 

The estimated heritability based on twin studies is 71% 
for advanced forms of this disease (24). Using a standard 
liability threshold model (25), the previously reported loci 
combined with the new loci discovered in this study 
explain ~39% of the total variance (or 55% of the heritabil- 
ity) of advanced AMD. Therefore, there are still unidentified 
genetic variants that may explain the missing heritability. 
Additional AMD risk variants likely remain to be discovered 
and will require a combined strategy of larger AMD meta- 
analyses to detect variants of more modest effect, genome 
scans using higher density SNP arrays to capture previously 
missed variants and exome-sequencing studies to identify 
rare variants. 



VEGFA is a member of the VEGF family and functions to 
increase vascular permeability, angiogenesis, cell growth and 
migration of endothelial cells. VEGFA is the target for mul- 
tiple therapies including ranibizumab, a molecule that is 
FDA-approved for the treatment of wet AMD. It has been 
hypothesized that activation of VEGFA may induce patholo- 
gic angiogenesis beneath the RPE layer. The newly identified 
SNP (rs4711751) is 60 kb downstream of VEGFA and >90 kb 
away from a SNP (rs2010963) in the VEGFA promoter region 
which has been reported to be associated with AMD (26). 
However, SNP rs4711751 appears to be independent of the 
rs2010963 variant (R^2 = 0.015, D' = 0.14 in samples of European 
ancestry); therefore, the association we identified near 
VEGFA was in a novel region and is not likely due to LD 
with SNPs in the VEGFA promoter region. Of note, the previously 
reported rs2010963 SNP showed no evidence of 
association in the TMMG meta-analysis (P = 0.26) (Supplementary 
Material, Table S6). In addition, rs4711751 is in 
moderate LD with nearby genome-wide significant variants 
reported in type 2 diabetes, waist-hip ratio and chronic 
kidney disease (R^2 = 0.31, D' = 0.91 to rs881858). However, 
rs881858 was not significantly associated with advanced 
AMD in the TMMG meta-analysis (P = 0.11) and cannot 
explain the association we observe in rs4711751. 
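
The two LD measures quoted above can behave differently, as in the rs881858 comparison (moderate R^2 with high D'). The sketch below computes both from haplotype and allele frequencies using the standard definitions. The frequencies in the example are hypothetical and are not taken from this study; the function name is illustrative.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (r_squared, d_prime) for two biallelic SNPs.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first SNP
    p_b  -- frequency of allele B at the second SNP
    """
    d = p_ab - p_a * p_b
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D' normalizes |D| by the largest value possible given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r_squared, d_prime


# Hypothetical frequencies: allele B almost always sits on an A haplotype,
# but the allele frequencies differ, so r^2 stays modest while D' is high.
r2, d_prime = ld_stats(p_ab=0.19, p_a=0.5, p_b=0.2)
print(round(r2, 4), round(d_prime, 2))
```

A high D' with a modest R^2 indicates that the two SNPs rarely recombine but have different allele frequencies, so one is only a partial proxy for the other.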

Finally, we note that the newly identified SNP rs4711751 is 
in strong LD with rs943080 (R^2 = 1.0 in 1000 Genomes CEU 
data), a variant that resides in a highly evolutionarily conserved 
region (Fig. 3). The risk allele (T) at rs4711751 is on 
the same haplotype as the evolutionarily conserved allele (T) 
at rs943080. Allelic change from T to C in this conserved 
region may disrupt a putative transcription factor-binding 
site for cone-rod homeobox (CRX), which is an essential 
transcription factor highly expressed in RPE and retinal 
ganglion cells (27). This suggests a possible mechanism for 
the candidate causal SNP rs943080. Individuals with the protective 
allele (C) at rs943080 may have decreased binding of 
CRX at the locus, leading to decreased expression of 
VEGFA, which in turn protects these individuals from 
development of the neovascularization involved in wet AMD. 
This hypothetical mechanism needs future experimental 
validation. 

Figure 2. FRK/COL10A1 and VEGFA regions and association with AMD. (A) Observed association in the 500 kb region surrounding the FRK/COL10A1 locus in 
the meta-analysis of TMMG data sets. The representative SNP (rs1999930) for this region with P = 3.1 × 10^-[?] is shown by a small purple circle. In the combined 
analysis including all 11 cohorts, this SNP was associated with AMD at P = 1.1 × 10^-8 (large purple diamond). (B) Forest plot for rs1999930 association across 11 
cohorts. (C) Observed association in the 500 kb region surrounding the VEGFA locus in the meta-analysis of TMMG data sets. The representative SNP (rs4711751) 
for this region with P = 2.2 × 10^-[?] is shown by a small purple circle. In the combined analysis including all 10 cohorts, this SNP was associated with AMD at 
P = 8.7 × 10^-9 (large purple diamond). (D) Forest plot for rs4711751 association across 10 cohorts. 

COL10A1 encodes the alpha chain of type X collagen, a 
short-chain collagen expressed by hypertrophic chondrocytes 
during endochondral ossification. In patients with osteoarthritis, 
expression of COL10A1 was significantly downregulated 
(28). Another collagen matrix pathway gene 
(COL8A1), which was implicated in our previous GWAS 
(15), also showed suggestive association with advanced AMD 
in our combined association analysis (P = 9.7 × 10^-[?]). The 
C-terminal non-collagenous (NC1) domain of the collagen 
has been reported as an inhibitor of angiogenesis (29-31). 
FRK has also been shown to have a negative function on the 
stimulation of microvascular survival of the developing 
retina by mediating the downstream signaling of 
thrombospondin-1 and the thrombospondin receptor (CD36), 
which has been shown to antagonize VEGFA signaling of 
the Akt pathway (32). The risk locus rs1999930 associated 
with advanced AMD in our study is in strong LD (R^2 = 
0.81 in 1000 Genomes CEU data) with a functional variant, 
rs9488843. The allele (G) at rs9488843, which creates a possible 
transcription factor-binding site for paired box 3 (PAX3) 
near the promoter region of COL10A1, is on the same haplotype 
as the allele (T) at rs1999930. Individuals with the protective 
allele (G) at rs9488843 may have increased binding of 
PAX3 at the locus, leading to elevated expression of 
COL10A1 or FRK, which results in the suppression or 
inhibition of angiogenesis. Further experimental work is 
required to investigate the functional role of rs9488843 in 
the development of advanced AMD. 

Table 1. Genes associated with AMD in genome-wide meta-analysis and analysis of all samples combined 

Figure 3. rs943080 in a putative CRX transcription factor-binding site. rs4711751 is in strong LD with rs943080, a variant which resides in a highly evolutionarily 
conserved region (UCSC genome browser) and disrupts a putative CRX transcription factor-binding site (CAA[T/C]C). 

The sample size of this study is the largest of all published 
association studies for advanced AMD to date. A major advan- 
tage of the study design is the careful diagnosis of cases across 
all cohorts. Since we included only subjects with advanced 
AMD in our study and excluded subjects with intermediate 
or large drusen, heterogeneity due to phenotype definition is 
reduced. However, it is possible that associations exist for 
other endophenotypes, like macular drusen, an early or inter- 
mediate stage of the disease, as suggested for loci in the 
HDL pathway (33). 

Our novel findings are not likely caused by population 
admixture or population substructure, because subjects in all 
cohorts are of European ancestry, and we adjusted for the 
genetic ancestry components in our study. The large number 
of replication cohorts and samples reduced the chance of false-
positive findings. The effect sizes of both rs1999930 and 
rs4711751 in the replication cohorts are smaller than the 
effect sizes estimated in the TMMG analysis. The larger 
effect size observed in the discovery cohort (TMMG) could 
be due to a 'winner's curse' phenomenon where association 
is often exaggerated relative to the estimated effect in 
follow-up studies (34). 

For this study, we utilized the generally accepted genome-
wide level of significance (P < 5 × 10^-8) as our threshold 
for association. However, that threshold assumes a multiple 
hypothesis testing burden of ~ 1 000 000 independent SNPs. 
Indeed, in our study, since we used the 1000 Genomes 
Project imputation data, there were many more individual 
SNPs tested. However, many of those SNPs are highly inter- 
correlated. To our knowledge, there are no empirical studies 
that address levels of genome-wide significance for the 1000 
Genomes Project-derived data. 
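
The arithmetic behind this point is simple to sketch. The conventional threshold corresponds to a Bonferroni correction of α = 0.05 for about 1 000 000 independent tests; applying the same correction naively to every imputed SNP tested here, as if all were independent, would give a stricter bound. The function name is illustrative, and treating all imputed SNPs as independent is a deliberately conservative assumption.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Conventional genome-wide threshold: ~1 000 000 independent SNPs.
conventional = bonferroni_threshold(0.05, 1_000_000)

# Naive bound if all 6 036 699 imputed SNPs were independent (they are not).
all_imputed = bonferroni_threshold(0.05, 6_036_699)

print(f"{conventional:.1e}  {all_imputed:.1e}")
```

Because many imputed SNPs are highly intercorrelated, the appropriate threshold presumably lies somewhere between these two values.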



Our genetic risk score model provides a framework for 
future research, and the clinical utility of genetic risk profiling 
of advanced AMD needs to be further evaluated in indepen- 
dent samples. Compared with other complex diseases, the 
associated risk variants for advanced AMD are more informa- 
tive in terms of predicting risk of disease. As this prediction 
model only included genetic risk factors, we expect an 
improvement of the performance of advanced AMD risk 
assessment with additional environmental and demographic 
factors in prospective studies as in our previous calculations 
(18,19). 

In summary, we have identified two novel associations for 
advanced AMD near FRK/COL10A1 and VEGFA. We also confirmed 
associations for 10 previously published advanced AMD 
loci in a combined analysis. The genetic loci associated with 
AMD suggest that the disease process may be explained in 
part by dysregulation of the alternative complement pathway 
(CFH, C2, CFB, C3, CFI), HDL cholesterol metabolism 
(LIPC, CETP, ABCA1), angiogenesis (VEGFA) and degradation 
of extracellular matrix (COL10A1, COL8A1, FRK, TIMP3, and 
possibly ARMS2). 



MATERIALS AND METHODS 

The TMMG meta-analysis data set consisted of: (i) 1242 cases 
and 492 controls from the Tufts/MGH GWAS Cohort Study 
(15), which were derived from ongoing AMD study protocols 
as described previously (8,15,24,35-37); (ii) 1355 cases and 
1076 controls from the MMAP Cohort Study (16); (iii) 1188 
controls from the (MIGen) Consortium Study (21) and 
(iv) 1378 controls from the GAIN Schizophrenia Study (22). 
For the Tufts/MGH sample, cases had GA or NV disease 
based on fundus photography and ocular examination [clinical 
age-related maculopathy grading system (CARMS) stages 
4 and 5] (38). Examined controls were unrelated to cases, 
60 years of age or older and were defined as individuals 
without macular degeneration, categorized as CARMS stage 
1, based on fundus photography and ocular examination. 
MMAP subjects were obtained and selected based on the 
dbGaP (phs000182.v2.p1) phenotype information (16). We 
included only MMAP controls and MMAP cases with GA or 
NV in the analysis; other MMAP subjects with large drusen 
were excluded. MIGen controls were included in our previous 
GWAS and have been described in detail (15). Shared controls 
from the GAIN Schizophrenia Study were obtained 
from dbGaP (phs000021.v3.p2) and are described in Manolio 
et al. (22). 

The Tufts/MGH and MIGen samples were genotyped at the 
Broad Institute and National Center for Research Resources 
(NCRR) Center for Genotyping and Analysis, using the Affy- 
metrix SNP 6.0 GeneChip (AFFY 6.0, 909 622 SNPs) (39). 
Shared controls from the GAIN study obtained from dbGaP 
were also genotyped by using the Affymetrix SNP 6.0 GeneChip. 
MMAP samples obtained from dbGaP were genotyped 
on the Illumina HumanCNV370v1 Bead Array (ILMN 370, 
370 404 SNPs) (16). All samples included in this study met 
quality control measures as described previously (15,16). 
Briefly, individuals with call rates <0.95, SNPs with call 
rates <0.98, Hardy-Weinberg equilibrium P < 10^-[?] and 
minor allele frequency (MAF) <0.01 were excluded. Potential 
relatedness between individuals was identified through a 
genome-wide identity-by-state (IBS) matrix using PLINK 
(40). IBS was estimated for each pair of individuals, and one 
individual from each duplicate pair or related pair (pihat > 
0.2) was removed. Ancestry outliers were identified based on 
principal components analysis using EIGENSOFT (Supplementary 
Material, Fig. S3) (41). After these quality control 
analyses (Supplementary Material, Table S1), the merged 
data set of TMMG contained 6728 samples, of which 4300 
were genotyped by AFFY 6.0 and 2428 were genotyped by 
ILMN 370. The TMMG data set genotyped by AFFY 6.0 
(644 413 SNPs passing quality control checks) was imputed 
using the phased CEU and TSI samples (566 haplotypes) as 
part of Pilot 3 of the 1000 Genomes Project as a reference by 
BEAGLE version 3.0 (42,43). Separate imputation was per- 
formed on the TMMG data set genotyped on the ILMN 370 
(329 315 SNPs passing quality control checks) using the 
same method. For the meta-analysis of GWAS, we included 
only imputed genotypes with imputation quality scores >0.6, 
where the score is defined as the ratio-of-variances (empiri- 
cal/asymptotic) of each genotype. This score is commonly 
applied as a quality filter for imputed genotypes and is equival- 
ent to the RSQR_HAT value by MACH and the information 
content (INFO) measure by PLINK (44). Since the imputation 
accuracies are relatively low for SNPs with low MAF, we only 
included imputed genotypes of common variants (MAF >0.01) 
in the analysis. A consensus set of 6 036 699 high-quality SNPs 
from each imputed data set was analyzed by PLINK, using a 
generalized linear model controlling for the genotyping 
platform and genetic ancestry based on principal component 
analysis. The imputed genotypes were coded by the genotype 
probabilities (dosages) for each SNP, which were given less 
weight in the analysis than individuals with certain genotypes 
coded by (0, 1, 2). The eigenvector scores with nominally 
significant (P < 0.05) association with case/control status 
(principal components 1, 2, 3, 4, 5, 6, 7, 11 and 16) and the original 
genotyping platform were included as covariates in the 
analysis. The top 40 SNPs were validated using Sequenom gen- 
otyping on 1600 samples that were also part of the Tufts/MGH 
GWAS. The MAFs were compared for these SNPs and showed 
no significant differences between imputed and genotyped fre- 
quencies in cases or controls. 
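
The imputation quality filter described above (ratio of empirical to asymptotic variance, score >0.6, MAF >0.01) can be sketched as follows. This is a simplified illustration, not BEAGLE's or MACH's actual estimator: it assumes the expected variance is 2p(1-p) under Hardy-Weinberg equilibrium, with p estimated from the mean dosage, and the function names and example dosages are hypothetical.

```python
from statistics import fmean, pvariance


def dosage_quality(dosages):
    """Ratio of the observed dosage variance to the expected 2p(1-p)."""
    p = fmean(dosages) / 2.0          # estimated allele frequency
    expected = 2.0 * p * (1.0 - p)    # variance if genotypes were known exactly
    if expected == 0.0:
        return 0.0
    return pvariance(dosages) / expected


def passes_filters(dosages, min_quality=0.6, min_maf=0.01):
    """Apply the quality-score and MAF thresholds quoted in the text."""
    p = fmean(dosages) / 2.0
    maf = min(p, 1.0 - p)
    return maf > min_maf and dosage_quality(dosages) > min_quality


# Hypothetical SNPs: one imputed with certainty, one with dosages
# shrunk toward the mean (little information about the true genotype).
confident = [0, 1, 2, 1, 0, 1, 2, 1]
uncertain = [0.9, 1.0, 1.1, 1.0, 0.9, 1.0, 1.1, 1.0]
print(dosage_quality(confident), passes_filters(uncertain))
```

When imputation is uncertain, dosages shrink toward the population mean, the observed variance falls below the expected variance, and the ratio drops toward zero.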

The replication data sets consisted of 5640 cases and 52 174 
controls from 10 independent cohorts from JHU, COL, Genen- 
tech, Iceland, Wash-U, AUS, RS, FR-CRET, Irish and an 
independent replication sample from Tufts/MGH. All replica- 
tion studies applied the same criteria for the diagnosis of cases. 
Population and shared controls were included in Genentech, 
Iceland and the RS samples. All participating studies received 
approval from institutional review boards (IRBs) and con- 
formed to the tenets of the Declaration of Helsinki. All partici- 
pants signed informed consent as approved by IRBs. 
Characteristics of each participating cohort are shown in Sup- 
plementary Material, Table S2. Samples from FR-CRET, Irish 
and Tufts/MGH replication data sets were genotyped at the 
Broad Institute by the Sequenom iPLEX assay. Samples 
from Wash-U were genotyped at the Sequenom Core Labora- 
tory of Washington University. Samples from AUS were gen- 
otyped in-house and at the Murdoch Children's Research 
Institute Sequenom Platform Facility. Samples from JHU 
and COL were genotyped by the TaqMan assay, using the 
ABI PRISM 7900 Sequence Detection System (ABI, Foster 
City, CA, USA). For the SNPs we intended to replicate, we 
obtained directly genotyped or imputed results from Genen- 
tech, Iceland and RS samples. Genentech samples included 
54 non-overlapping cases and 229 controls from the AREDS 
cohort (genotyped using Illumina Human610-Quad), 347 
cases from a Genentech trial (Illumina Human660W-Quad), 
3390 controls from the SEE GWAS study (45) (Illumina 
HumanHap550), 2274 controls from the CGEMS breast 
cancer study (46) and 2256 controls from the CGEMS prostate 
cancer study (47). For candidate SNPs not directly genotyped 
in the Genentech samples, genotype information was imputed 
using IMPUTE version 2 (48) with combined reference data of 
CEU and TSI population from the 1000 Genomes Project 
(June 2010 release) and HapMap3 Project. The Iceland 
samples were genotyped using Illumina HumanCNV370v1 
Bead Array. Candidate SNPs not directly genotyped were 
imputed using IMPUTE version 2 with the reference data of 
CEU and TSI population from the 1000 Genomes Project 
(June 2010 release), HapMap2 Project (release 22) and a refer- 
ence data set of 500-1000 Icelanders genotyped using the 1 
million OmniQuad and CardioMetabo chips from Illumina. 
Owing to the larger size of the Icelanders data set, the imputa- 
tion is more reliable based on the Icelanders data set than the 
imputation based on the HapMap or 1000 Genomes Project. 
The Rotterdam Study samples were genotyped by Illumina 
Infinium II HumanHap550 (cases n = 192, controls n = 
1887) and Illumina Human610-Quad Array (cases n = 29, 
controls n = 2600). Candidate SNPs not directly genotyped 
were imputed using MACH 1.0 (49) with the reference data 
of CEU and TSI population from the HapMap2 project 
(release 22). Genotyping and imputation methods used by 
the Rotterdam Study samples have been described in detail 
previously (50). Standard quality control and statistical analysis 
for these samples were performed by Genentech, Iceland 
and RS separately. SNPs which met genotype quality control 
criteria in other replication cohorts were tested for association 
with advanced AMD, using a generalized linear model in 
PLINK. We used an additive model for each SNP (0, 1 or 2 
minor alleles). The P-value for the combined analysis was 
derived from the effect size estimates and standard errors, 
using a fixed effects model by METAL (51). Heterogeneity 
of the association between SNP and disease was evaluated 
by Cochran's Q-test. 
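
The heterogeneity statistics mentioned above can be sketched from per-cohort effect sizes and standard errors. The code below computes Cochran's Q and the derived I^2 statistic using their standard definitions; it does not reproduce METAL's implementation, and the cohort estimates in the example are hypothetical. Converting Q to a P-value requires a chi-square distribution with (number of cohorts - 1) degrees of freedom, which is omitted here to keep the sketch dependency-free.

```python
def cochran_q(betas, ses):
    """Return (Q, degrees of freedom, I^2) for per-cohort log-OR estimates."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    # Q: weighted sum of squared deviations from the fixed-effects estimate.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical log odds ratios and standard errors for three cohorts.
q, df, i2 = cochran_q([-0.21, -0.10, -0.15], [0.05, 0.04, 0.06])
print(round(q, 2), df, round(i2, 2))
```

An I^2 near zero, as reported for both new loci, suggests that the differences between cohort estimates are roughly what sampling error alone would produce.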

SUPPLEMENTARY MATERIAL 

Supplementary Material is available at HMG online. 